A New Data Replication Scheme for PVFS2
Abstract
PVFS is one of the most popular parallel distributed file systems and is still widely used today. Its current version is PVFS2. PVFS2 offers leading I/O performance, but its reliability and stability are weaker, partly because it lacks data replication. This paper presents a new data replication scheme for PVFS2. In our approach, the backup operation is performed on the servers, so creating copies of files does not affect the user experience. In addition, we optimize the PVFS2 read operation: with copies available, we can choose which servers to read from, so reads stay parallel under adverse conditions such as a server being down or some servers carrying a noticeably higher load than others. Experimental results verify the effectiveness and efficiency of our method.
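The replica-aware read described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or any PVFS2 API: the `Server` type, the `load` metric, and `pick_read_servers` are hypothetical names introduced only for illustration, and the selection rule (skip dead servers, then prefer the least-loaded holder) is an assumption about one plausible policy.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    alive: bool
    load: float  # hypothetical load metric, e.g. number of pending requests


def pick_read_servers(stripe_replicas, servers):
    """Choose one server per stripe from the servers holding a copy of it.

    stripe_replicas: list where entry i lists the names of servers holding stripe i.
    servers: dict mapping server name to Server.
    Returns a list with one chosen server name per stripe.
    """
    # Reads already assigned within this plan, so one server is not picked for everything.
    assigned = {name: 0 for name in servers}
    plan = []
    for stripe, holders in enumerate(stripe_replicas):
        # Skip servers that are down.
        live = [h for h in holders if servers[h].alive]
        if not live:
            raise RuntimeError(f"no live replica for stripe {stripe}")
        # Rank by reported load plus reads already assigned here; ties broken by name.
        best = min(live, key=lambda h: (servers[h].load + assigned[h], h))
        assigned[best] += 1
        plan.append(best)
    return plan


servers = {
    "a": Server("a", alive=True, load=0.0),
    "b": Server("b", alive=False, load=0.0),
    "c": Server("c", alive=True, load=5.0),
}
plan = pick_read_servers([["a", "b"], ["b", "c"], ["c", "a"]], servers)
```

In this toy setup server `b` is down, so the first two stripes fall back to their surviving copies, and the third stripe goes to `a` because `c` carries a higher load. A real scheme would also need to decide how load is measured and how stale that measurement may be, which the abstract does not specify.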
Similar resources
A Non-MDS Erasure Code Scheme for Storage Applications
This paper investigates the use of redundancy and self repairing against node failures in distributed storage systems using a novel non-MDS erasure code. In replication method, access to one replication node is adequate to reconstruct a lost node, while in MDS erasure coded systems which are optimal in terms of redundancy-reliability tradeoff, a single node failure is repaired after recovering the ...
Project Report for Project Shared Parallel File System
In recent years, shared parallel file systems have become a hot area in high-throughput computing. Many systems have been developed for this purpose, including PVFS2 and GFS. These systems were studied in my project. In this project, the concept of such a shared parallel file system has been built up by installing it on four nodes of the LISA cluster at SARA (Stichting Academisch...
Improving Data Availability Using Combined Replication Strategy in Cloud Environment
As data-intensive applications in cloud computing grow day after day, data popularity in this environment becomes critical. Hence, to improve data availability and provide efficient access to popular data, replication algorithms are now widely used in distributed systems. However, most of them only replicate a static number of replicas on some requested chosen sites and it is ob...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...
Increasing performance in Data grid by a new replica replacement algorithm
Data Grid provides sharing services for very large data around the world. Data replication is one of the most effective approaches to reduce access latency and response time. In addition to the benefits, replication has costs such as storage and bandwidth consumption, especially when storage space is low and limited. Therefore, the data replacement should be done wisely. In this p...